Abstract

The eukaryotic Cys2His2 zinc finger proteins bind to DNA ubiquitously at highly conserved domains and are responsible for gene regulation and the spatial organization of DNA. To study and understand the zinc finger DNA-protein interaction, we use the extended ladder in the DNA model proposed by Zhu, Rasmussen, Balatsky and Bishop (2007) [1]. Considering one single spinless electron in each nucleotide π-orbital along a double DNA chain (dDNA), we find a typical pattern for the bottom of the occupied molecular orbital (BOMO), highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO) along the binding sites. We specifically looked at two members of the zinc finger protein family: the specificity protein 1 (SP1) and early growth response 1 (EGR1) transcription factors. When the valence band is filled, we find electrons in the purines along the nucleotide sequence, compatible with the electric charges of the binding amino acids in the SP1 and EGR1 zinc fingers.

INTRODUCTION


The zinc fingers (ZF), the key protein family for chromatin condensation as well as gene regulation, form one of the major eukaryotic DNA-protein binding motifs. There are around one thousand ZF encoding genes [2] and ten thousand highly conserved putative ZF binding sites along the human genome [3]. The majority of ZF proteins assist transcription factors, acting as repressors, activators and regulators [4, 5]. They are also responsible for the spatial structure of the genome in DNA loops, exposing or hiding genes, and work as insulators, avoiding the spread of heterochromatin [6]. These ZF proteins could mediate long-range chromosomal interactions in eukaryotic cells, over more than 100 thousand base pairs (bp) [7-9]. However, the exact relation between the long-range correlations in genomic scale nucleotide sequences (>20 thousand bp) and the three-dimensional chromosomal organization is still not clear [10-15] and is subject to intense research. Furthermore, since transcription factors spot specific sequences without opening the double helix, we expect some biological mechanism for probing nucleotides based on local properties [16]. There are two basic approaches to understand this mechanism: a polymeric description and a description by electric charges.
Since electrons play a crucial role in the DNA-protein interaction, we must consider the DNA from the electric charge distribution too. The electronic nature of DNA is still in debate, but the literature gives us some cues about its organization. The double helices behave as insulators or conductors under silver deposition [23], material contaminants [24], environmental conditions [25] and others [26]. However, when the conductivity is measured in atmosphere, low vacuum or Tris-HCl buffers, DNA has semiconductor features with the typical gap between the valence and conductor bands in the electronic density of states (DOS) [26-31]. In order to describe this behavior, ionization models (also known as ballistic, polaron or wire-like charge transport) have been proposed. The parameters in ionization models are easily measured, since one just needs to evaluate the loss of energy when one electron is taken from a neutral molecule. The lost electron is usually in the highest occupied molecular orbital in the valence band (HOMO) and it may easily jump to the lowest unoccupied molecular orbital in the conductor band (LUMO). However, the literature also suggests electronic affinity models, where the energy is described by the gain of electrons in a neutral molecule [37-41]. These theoretical results usually combine density functional theory and molecular dynamics simulation.


In 2007 Zhu et al. joined both molecular ionization and affinity approaches [1]. This adaptation of the Peyrard-Bishop DNA melting model [17] describes the nucleotide sequence from its semiconductor features, avoiding the heavy computational cost of ab initio molecular dynamics simulations. Their work allowed the electronic local density of states (LDOS) to be spotted in one viral P5 promoter sequence, connecting the LDOS with one specific biological function [1, 16]. Unfortunately, their model did not show the gap between HOMO and LUMO in $(C)_n$ that one expects from the experimental data [30, 42]. In our work we fix the problem of the HOMO-LUMO gap by introducing the extended ladder in the model, as suggested by [36, 43-45]. We also analyze systematically the DNA-protein binding sites for two transcription factor proteins: the human specificity protein 1 (SP1) [46] and the early growth response factor 1 (EGR1, also known as Zif268) [47], both localized in the promoter of a great variety of genes and characterized by a molecular structure called Cys2His2 zinc finger (ZF). The descriptions of ZFs as well as SP1 and EGR1 are in the appendix.

This paper is organized as follows. First, we describe the extended ladder model. Then, we test the model by studying the electronic behavior of $(C)_n$ and $(T)_n$ sequences. We replace one nucleotide in order to understand the electronic interactions along the DNA chain. After this, we analyze systematically the SP1 and EGR1 binding sequences and report strand dependent and strand independent results.


THE MODEL 


In this paper, we consider one double DNA chain with $n$ base pairs, totaling $2n$ nucleotides, Fig. 1(d). Strictly speaking, our model does not consider nucleotides but nucleosides, i.e. the nucleotide without the phosphate group. However, we will call nucleosides nucleotides in this work in order to simplify the nomenclature. The electronic behavior of the spinless free electron of the π-orbital of the nucleotide is described using the same Hamiltonian as in [1],


$$H = H_e + H_{eb} + H_b. \qquad (1)$$

The first term in Eq. (1) is given by

$$H_e = \sum_{i=1}^{2n} \epsilon_i c_i^\dagger c_i + \sum_{i=1}^{n-1} t_{2i-1,2i+1} c_{2i-1}^\dagger c_{2i+1} + \sum_{i=1}^{n-1} t_{2i,2i+2} c_{2i}^\dagger c_{2i+2} + \sum_{i=1}^{n-1} t_{2i-1,2i} c_{2i-1}^\dagger c_{2i} + \sum_{i=1}^{n-1} t_{2i-2,2i+1} c_{2i-2}^\dagger c_{2i+1} + \mathrm{H.c.}, \qquad (2)$$


where $c_i^\dagger$ and $c_i$ are the electron creation and annihilation operators at site $i$, $\epsilon_i$ is the on-site ionization energy, $n$ is the number of base pairs and $t_{ij}$ is the electron hopping rate between nucleotides $i$ and $j$. Here, we are using the extended ladder, where we duplicate the one dimensional lattice in [1] and include the interstrand hopping [36, 43, 45]. The structure of the ladder considers the long-distance charge and hole transport along dDNA [48-50]. The second term in Eq. (1) represents the coupling between the free electron and the nucleotide displacement field,


$$H_{eb} = \sum_{i=1}^{2n} \alpha_i y_i c_i^\dagger c_i, \qquad (3)$$

where $y_i$ is the displacement (dark dotted line) of the electronic cloud from the equilibrium in the nucleotide (light dotted line), Fig. 1(d). The last term $H_b$ represents the interaction of the electron with the nucleotide:


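$$H_b = \sum_{i=1}^{2n} \left[ D_i \left( e^{-a_i y_i} - 1 \right)^2 + \frac{k_y}{2} \left( 1 + \rho e^{-\sigma (y_i + y_{i-1})} \right) (y_i - y_{i-1})^2 \right], \qquad (4)$$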

where $D_i$ and $a_i$ are parameters of the Morse potential, $k_y$ is the spring constant of the anharmonic interaction between two contiguous base pairs, and $\rho$ and $\sigma$ are the parameters for modifying $k_y$ in order to evaluate long-range cooperative electronic behavior [1].

We study the electronic part $H_e$ and $H_{eb}$ of the Hamiltonian in Eq. (1) by computing the eigenvalues $E_k$ and eigenvectors $\phi_i^k$, $i, k = 1, \ldots, 2n$, of the matrix

$$H_{e+eb} = \begin{pmatrix} \epsilon_1 + \alpha_1 y_1 & t_{1,2} & \cdots \\ t_{1,2} & \epsilon_2 + \alpha_2 y_2 & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}. \qquad (5)$$

This matrix is similar to the one suggested in [36, 45], except for the electron base component $H_{eb}$.
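As an illustration of the kind of matrix that is diagonalized here, the following minimal sketch builds a two-strand ladder in the site basis and computes its eigenvalues and eigenvectors. It is not the authors' implementation: the function name `ladder_matrix`, the 0-based site layout, the uniform hopping values and all numerical parameters are assumptions chosen for illustration, and the diagonal interstrand hoppings of the extended ladder are omitted for brevity.

```python
import numpy as np

def ladder_matrix(eps, alpha, y, t_intra, t_rung):
    """Site-basis matrix of H_e + H_eb for a simple two-strand ladder.

    Assumed layout (0-based): base pair b occupies sites 2b and 2b+1,
    so sites s and s+2 are neighbors along the same strand.
    """
    eps = np.asarray(eps, dtype=float)
    m = eps.size  # 2n sites
    # Diagonal: on-site energy plus electron-displacement coupling.
    H = np.diag(eps + np.asarray(alpha, dtype=float) * np.asarray(y, dtype=float))
    for s in range(m - 2):      # intrastrand hopping s <-> s+2
        H[s, s + 2] = H[s + 2, s] = t_intra
    for s in range(0, m, 2):    # interstrand (rung) hopping inside a base pair
        H[s, s + 1] = H[s + 1, s] = t_rung
    return H

# Hypothetical parameters (eV), not the values used in the paper.
n = 4
eps = np.tile([8.0, 9.0], n)
H = ladder_matrix(eps, alpha=np.full(2 * n, 0.5), y=np.zeros(2 * n),
                  t_intra=0.3, t_rung=0.1)
E, phi = np.linalg.eigh(H)  # E ascending; phi[:, k] is the k-th eigenvector
```

Because the matrix is real and symmetric, `numpy.linalg.eigh` returns the eigenvalues in ascending order, so the lowest columns of `phi` play the role of the states at the bottom of the band.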

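In order to estimate $y_i$, we consider the self-consistency condition, given by

$$\left\langle \frac{\partial H_{eb}}{\partial y_i} \right\rangle + \frac{\partial H_b}{\partial y_i} = 0, \qquad (6)$$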

where $\langle \ldots \rangle$ represents the average over the free electrons in the system. The iteration method for solving Eqs. (5) and (6) is described in [1], and it consists of the following procedure. Given an initial condition for $\{y_i\}$, we diagonalize the matrix (5) in order to compute the electronic occupation of each site $\langle n_i \rangle$, where $n_i = \sum_{k=1}^{n_e} |\phi_i^k|^2$ and $n_e$ is the number of electrons in the system. This set of $\langle n_i \rangle$ is used in the Langevin equation calculated from Eq. (6). We update the values of $\{y_i\}$ using the fourth-order Runge-Kutta method for the Langevin equation. The new $\{y_i\}$ set is inserted again in the matrix of Eq. (5). We repeat the iteration until we achieve the minimum local adiabatic electronic and structural configuration. The computations were done using R with the package deSolve for the Runge-Kutta algorithm [51]. The choices of the model parameters are in the appendix.
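The iteration described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' R/deSolve code: the Langevin dynamics is replaced by a noise-free overdamped relaxation, the stacking term is omitted so that only the Morse force acts on each site, and the names `occupations`, `morse_force` and `relax` as well as all numerical values are hypothetical.

```python
import numpy as np

def occupations(H, n_e):
    """Site occupations n_i = sum over the n_e lowest states of |phi_i^k|^2."""
    _, phi = np.linalg.eigh(H)
    return (phi[:, :n_e] ** 2).sum(axis=1)

def morse_force(y, D, a):
    """Force -dV/dy for the Morse potential V = D (exp(-a y) - 1)^2."""
    e = np.exp(-a * y)
    return 2.0 * D * a * e * (e - 1.0)

def relax(H0, alpha, D, a, n_e, dt=0.05, tol=1e-8, max_iter=5000):
    """Alternate diagonalization and a fourth-order Runge-Kutta update of y."""
    y = np.zeros(H0.shape[0])
    occ = occupations(H0, n_e)
    for _ in range(max_iter):
        occ = occupations(H0 + np.diag(alpha * y), n_e)

        def force(z):
            # Electron-displacement coupling plus Morse restoring force.
            return -alpha * occ + morse_force(z, D, a)

        k1 = force(y)
        k2 = force(y + 0.5 * dt * k1)
        k3 = force(y + 0.5 * dt * k2)
        k4 = force(y + dt * k3)
        y_new = y + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        if np.max(np.abs(y_new - y)) < tol:
            return y_new, occ
        y = y_new
    return y, occ

# Hypothetical 8-site chain with alternating on-site energies (eV).
m = 8
H0 = np.diag(np.tile([8.0, 9.0], m // 2)) + 0.1 * (np.eye(m, k=1) + np.eye(m, k=-1))
alpha = np.full(m, 0.1)
y, occ = relax(H0, alpha, D=0.05, a=4.0, n_e=m // 2)
```

With a positive coupling, occupied sites relax to negative displacements, which is the qualitative contraction discussed later in the text; the magnitudes here depend entirely on the illustrative parameters.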

In this work, we estimate the spatial distribution of electrons, the energy levels and the displacement field only for $n_e = n$. Thus, the valence band is always filled with electrons and the conductor band is empty. Our model does not have periodic boundary conditions, so the regions selected for analysis must lie well inside the sample in order to avoid boundary effects. We analyze only nucleotide sequences at least 10 bp distant from the beginning and the end of the sample.

We apply the proposed model to poly(C)-poly(G) and poly(T)-poly(A) sequences with 63 base pairs in order to understand the behavior of the electrons dispersed along the DNA chain.

According to the literature, we do expect a gap in the energy band in the test sequences $(C)_{63}$ and $(T)_{63}$, as can be seen in Fig. 2(a). The gap between the valence and conductor bands in $(T)_{63}$ is narrower than in $(C)_{63}$. Furthermore, the gap of the pure $(C)_{63}$ sequence can be modulated when we introduce one single T at position 32. One HOMO and one LUMO appear in the gap of the energy band, marked as H and L in Fig. 2(d). Moreover, the HOMO and LUMO electronic clouds, dispersed in pure $(C)_{63}$ (black lines in Fig. 2(b,e)), become localized in the introduced T (red lines in Fig. 2(b,e)). We remark that the electronic cloud of HOMO is dispersed in a pure $(C)_{63}$. Thus, thymines and adenines are related with LUMOs, and cytosines and guanines are linked with the localization of the bottom of the occupied molecular orbital (BOMO).

On the other hand, when we substitute one thymine by cytosine in a $(T)_{63}$, the BOMO electronic cloud will be localized in the replaced nucleotide, Fig. 2(b). Moreover, the eigenvalue related to this electronic state remains in the valence band with a value of 8.05 ± 0.01 eV, G in Fig. 2(d). Furthermore, the electronic distribution of HOMO will be positioned over the cytosine too. The CG rich domains usually are related to HOMO. We do not observe any alterations in the conductor band for a $(T)_{63}$ with and without the replacement.


THE ELECTRONIC DENSITY OF STATES IN SP1 AND EGR1 TRANSCRIPTION FACTORS


We apply the procedure described in the previous section and estimate the eigenvalues $E_k$ and eigenvectors $\phi^k$ of the total Hamiltonian in Eq. (1) for the sequences in Table 2. The criteria for the sequence selection as well as the method for nucleotide alignments are

sequence in the literature: 5'-ggggcgggg-3'

SP1 and EGR1, respectively.

Fig. 3 shows a typical set of results for the SP1 binding site of the gene MAOB and the EGR1 binding site of the gene EGR1. The nucleotide sequence of MAOB SP1 is in the reverse complementary reading direction and EGR1 is in the complementary strand, Fig. 3(g).

Although we have $2n$ eigenvalues and eigenvectors, each one related with one of the $2n$ nucleotides of the system, the relevant electrons for the binding sites are those linked with BOMO, HOMO and LUMO, respectively denoted as G, H and L in the density of states, Fig. 2(a,b).

We start with the analysis of the position of BOMOs, looking for the values of $|\phi_i^k|^2$ with $k$ close to 1. When we consider $n = 50$, as in MAOB and EGR1, the analysis of the first 8 eigenvectors is usually sufficient to spot the relevant ones. The electronic cloud $n_i$, $0 < n_i < 1$, is strand dependent, but we do not observe any strand related pattern for individual electrons. Thus, we sum the probabilities of the direct and the complementary strands to find the local electronic cloud. BOMOs could be degenerate in many electrons along the nucleotide sequence, but we should focus just on those around the binding sites, yellow and black lines in the valence band in Fig. 3(c,d). Note that the sum of these two degenerate BOMOs, $\sum_k |\phi_i^k|^2 \delta(E - E_k)$, will result in the LDOS of the binding site, which is proportional to the differential tunneling conductance [1]. At low temperature, this quantity could be measured by a scanning tunneling microscope (STM). The zinc fingers of the SP1 and EGR1 transcription factors act as the tips of an STM, scanning binding sites along the DNA chain. Finally, we mark the nucleotides with at least 10% of probability of electronic presence in gray and yellow in Fig. 3(g,h).
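The strand summation and the 10% marking criterion described above can be sketched as follows. The function names and the toy eigenvector are hypothetical, and the site layout (two consecutive sites per base pair, one on each strand) is an assumption made for illustration.

```python
import numpy as np

def strand_summed_probability(phi_k):
    """Sum |phi|^2 of one eigenvector over the two sites of each base pair."""
    p = np.abs(np.asarray(phi_k)) ** 2
    return p[0::2] + p[1::2]

def marked_positions(phi_k, threshold=0.10):
    """0-based base-pair positions whose summed probability reaches the threshold."""
    return np.flatnonzero(strand_summed_probability(phi_k) >= threshold)

# Toy eigenvector on 4 base pairs (8 sites); its squared entries sum to 1.
phi = np.sqrt(np.array([0.0, 0.05, 0.30, 0.30, 0.25, 0.05, 0.05, 0.0]))
print(marked_positions(phi))  # base pairs 1 and 2 carry at least 10% each
```

The same two helpers can be applied to any eigenvector column returned by a diagonalization routine, for instance the lowest states when looking for BOMOs.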

The procedure for localizing the electronic cloud associated with HOMO and LUMO is very similar to the one used to spot BOMO probability distributions, except that $k$ of HOMO and LUMO are close to $n$. In order to find the electronic clouds, we need to consider $k$ from 46 to 50 for HOMO and 51 to 54 for LUMO, when $n = 50$. The electronic cloud associated with LUMO is always close to the HOMO, with a maximum distance of ±6 bp. Since the probabilities of finding HOMO or LUMO electrons are strand independent, we add both direct and complementary strands in Fig. 3(c-f). The orange lines in Fig. 3(c,d) and red lines in Fig. 3(e,f) are HOMO and LUMO, respectively. We can also measure the LDOS of HOMO and LUMO with STM, using the same approach as for BOMOs. The nucleotides with at least 10% of probability of electronic presence are denoted by orange and red boxes in Fig. 3(g,h).

Now we return to Table 2, where all BOMOs are marked in gray and yellow and the HOMO and LUMO electrons are in orange and red boxes. Looking at Table 2, the electronic distribution patterns for the binding sites of the SP1 and EGR1 transcription factors are clear.

In the case of SP1, BOMO clouds are over the first (5'-ggg-3') and third (5'-ggg-3') triplets of the consensus sequence, light green in Table 2. These triplets are related with the first and third ZF binding positions of SP1. Moreover, the first BOMO electronic cloud spans 4 to 5 bp, while the second ranges from 2 to 4 bp. The eigenvalue of these BOMOs is 7.98 ± 0.05 eV. The energy level of the HOMO electrons is fixed at 8.52 ± 0.02 eV and the electronic cloud size spans between 1 and 2 bp. We observe some fluctuation in the LUMO eigenvalue for SP1, which is 9.3 ± 0.1 eV. The LUMO electrons envelop 2 to 5 base pairs. The positions of the HOMO and LUMO associated electrons are always before the first electron, and these electrons are placed from -12 to 1.

For EGR1, the first BOMO spans from position 3 to 7 over the second triplet (5'-ggg-3'), and the probability of finding this particular electron spans over 2 to 4 bp. The second triplet is the binding site of the second ZF of the early growth response protein 1. The second electron is after the second triplet and is dispersed between nucleotide positions 7 and 15, covering the third triplet. The electronic cloud size ranges from 2 to 4 bp. All BOMO energies in Table 2 are 7.99 ± 0.03 eV. The HOMO and LUMO electronic clouds are over the second electron. All HOMO energies in Table 2 are 8.52 ± 0.01 eV, and the HOMO related electronic clouds have a length of 1 or 2 base pairs. The LUMO energies fluctuate with an average value of 9.4 ± 0.1 eV. The LUMO electronic cloud sizes vary from 1 to 6 bp and they are located at positions 10 to 20.

Considering the HOMO and LUMO distributions, we believe that they may play some role in SP1 and EGR1 binding. These proteins bind DNA, embracing the major groove of the double helix as a guide. In the case of SP1, the head may interact with nucleotides between positions -11 and -1. The behavior for EGR1 is more elusive, because HOMO and LUMO are completely dispersed over the nucleotides 5 to 20. Despite the description in the literature emphasizing the similarity between the ZF and nucleotide interaction over the consensus nucleotides [61], the mechanisms of protein attachment in EGR1 and SP1 are not the same [62].


The HOMO and LUMO electronic clouds frequently overlap, and the main reason why the electrons of HOMO and LUMO are always in adenine and thymine rich sequences is as follows. The electrons from the highest occupied molecular orbital in the valence band should move to the nearby lowest unoccupied molecular orbital in the conductor band when the system is disturbed. The easiest way for this movement is placing the electron in regions with higher excitability, i.e. AT rich domains. We may conjecture that this jump of the electron from the HOMO to the LUMO has an as yet unknown role in the transcription factors SP1 and EGR1.

On the other hand, the less mobile electrons are those in the CG rich domains, since they are at the bottom of the DOS. So, we expect to spot BOMOs in CG rich sequences instead of AT rich regions, as we see for $(T)_{63}$ with one C at position 32, described previously. Furthermore, these BOMOs are degenerate, i.e. all electrons present the same energy level. Thus, these cytosine and guanine rich regions, typical in promoters, are ideal landmarks for SP1 and EGR1 binding sites. The absence of excitation in the lowest states is vital for ZF transcription factors, because nucleotides with mobile electrons may change the position of the beginning of the gene reading, altering the gene expression. We are aware of the eigenvalues between BOMO and HOMO, but we have not yet found any obvious pattern associated with SP1 and EGR1.

We never observe an overlap between the probabilities of BOMOs and the overlapped LUMO-HOMO electronic clouds, using the criterion of a minimum of 10% localization probability of one particular electron in the samples in Table 2.


THE COLLECTIVE ELECTRONIC BEHAVIOR 


The electronic probabilities of individual electrons, discussed in the previous section, are strand independent. However, the collective electronic probabilities $n_i$ and the field displacement $y_i$ depend on the strand.

When we have $n_e = n$ electrons, we fill only the valence band and usually observe, in all analyzed sequences in Table 2, 100% of probability of electronic presence in purines (adenine or guanine) and the absence of an electron (hole) in pyrimidines (thymine or cytosine), in agreement with the DFT analysis [41]. Figs. 3(i,j) are the probabilities $n_i$ associated with finding one electron in one nucleotide for the MAOB SP1 and EGR1 binding site sequences. The electronic presence in each purine gives us a new biological interpretation of the contribution of Peng et al. [11, 12]. Using exon and intron rich segments of the eukaryotic genome, they construct a DNA walk using purine and pyrimidine as the criterion for steps. Then they report self-affine fractality in the walk, showing long-range correlation in the purine and pyrimidine distribution. When we look at Figs. 3(i,j), purines and pyrimidines reflect the electronic distribution along the DNA chain. This electronic distribution is related to the BOMO, HOMO and LUMO distributions, which work as ZF binding sites, for example. It is important to stress that they report a self-affine fractal, but not a self-similar fractal. The self-similarity is related to the palindromic sequences, connected with DNA-loop structures such as tRNA and rRNA [63, 64], while self-affinity is related to introns [11]. Furthermore, the evidence of polynomial long-range nucleotide interaction is also supported by [13, 14]. In that work the long contiguous sequences are represented by a sequence of 0 and 1 for noncoding and coding nucleotides, where noncoding nucleotides are intergenic regions and introns, and coding nucleotides are genes and regions for metabolic control. We made an auto-correlation analysis over the binary sequence and reported correlation between two coding nucleotides at least 20 thousand bp apart.
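The two analyses mentioned above, the purine-pyrimidine DNA walk and the auto-correlation of a binary sequence, can be sketched as follows. This is only a minimal illustration: the step convention (+1 for pyrimidine, -1 for purine), the simple normalized estimator and the function names are assumptions, and the input is assumed to contain only the letters A, C, G and T.

```python
import numpy as np

def dna_walk(seq):
    """Cumulative walk: +1 for a pyrimidine (C, T), -1 for a purine (A, G)."""
    steps = np.array([1 if base in "CT" else -1 for base in seq.upper()])
    return np.cumsum(steps)

def autocorrelation(x, lag):
    """Normalized auto-correlation of a non-constant sequence at a given lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    if lag == 0:
        return 1.0
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

walk = dna_walk("GGCA")                 # purine, purine, pyrimidine, purine
coding = [0, 1] * 4                     # toy binary coding/noncoding sequence
c2 = autocorrelation(coding, lag=2)
```

For a real genomic sequence the lag would range up to tens of thousands of base pairs, and a dedicated estimator with proper finite-size normalization would be preferable to this sketch.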

The existence of long-range correlations has another consequence in the model. The second term in Eq. (4), the stacking interaction between adjacent base pairs, mimics the bending of DNA as a polymer. But we will see that the short-range exponential term in Eq. (4) does not contribute to the electronic behavior. This term has an energetic value of the order of 10“^ eV, when we consider typical values for the parameters: $y_i \sim -0.1$ Å, $\rho \sim 10$ and $\sigma \sim 0.35$ Å$^{-1}$. On the other hand, the Morse potential is of the order of 10“^ eV. The stacking interaction will be relevant only if we consider $\rho > 100$, but such high experimental values for $\rho$ are unlikely. This short-range element of the model comes from the DNA melting problem, where the interstrand binding of the double helix may open [17]. In this case, the short-range element is important, since it is easier to open the dDNA when the neighboring bp is already open. Moreover, $y_i$ represents the displacement field of the electronic cloud of the hydrogen bonds between nucleotides in the DNA melting model, whereas we change the concept of $y_i$ to the π-orbital of the nucleotide. So, the short-range part in Eq. (4) is no longer relevant. In order to simplify the model, one may suggest eliminating the harmonic oscillator in Eq. (4) too. However, the harmonic oscillator is important for describing the stacking interaction in the Langevin equation. On the other hand, the bending and the torsion of the double chain have influence over the DNA chain, but this behavior cannot be explained by Eq. (4), because we have just short-range exponential interactions and a harmonic oscillator between two neighboring base pairs. The missing long-range element in Eq. (4) is the object of further research. Finally, we recall that we do not observe the presence of the electron in purine sequences with one replaced pyrimidine or vice versa: $(T)_{63}$ with one C at $i = 32$ or $(C)_{63}$ with one T at $i = 32$. The presence or absence of electrons depends on the neighboring base pairs.

The presence of electrons in purines has a profound impact on the ZF binding. We show the consensus nucleotide sequence and the core zinc finger binding amino acids in Fig. 1(b) and (c) for EGR1 and SP1, respectively.

The EGR1 amino acid sequence is available in the Universal Protein Resource databank (UniProt) with the accession code P18146 [65]. The three zinc fingers of the human EGR1 are located between positions 338 to 362, 368 to 390 and 396 to 418 of the 543 amino acid long sequence. The dotted lines in Fig. 1(b) are the hydrogen bonds between aspartic acid (D) and adenine or cytosine, which stabilize the first guanine-arginine (R) hydrogen bond of the ZF. The positively charged basic arginine (R), histidine (H) and lysine (K) responsible for the DNA-protein binding are highlighted in gray, while the negatively charged weak acid threonine (T) and strong acid glutamic acid (E) are in yellow. Each red line in Fig. 1(b) is the binding of one particular nucleotide with its respective oppositely charged amino acid of the core zinc finger segment of EGR1 [61].

The 785 amino acid long SP1 transcription factor, UniProt accession code P08047, has three tandem ZFs between positions 626 to 650, 656 to 680 and 686 to 708 [46, 66]. The dotted lines are the hydrogen bonds that stabilize the first ZF guanine-arginine (R) or guanine-lysine (K) bonds. The bindings between the core ZF amino acids and the corresponding nucleotides are indicated by red lines in Fig. 1(c). Again, each nucleotide is connected with an oppositely charged amino acid.

Concerning the electrical charges of the SP1 zinc finger tips, i.e. the middle amino acid that bonds with the middle nucleotide of the triplet, there is one motif associated with the position of BOMOs, described in the previous section. The pattern of positive and negative charges along the nucleotide sequence coincides with the position of the three ZF tips. Since BOMOs are the most stable electrons, they guide the fingers as holders for fixing SP1 to the dDNA. We observe the same phenomenon for EGR1.

When we compare the strand independent analysis of the previous section with the electronic strand dependence, one may suspect a contradiction in the presence of BOMO in the complementary strand 3'-ccc-5' at EGR1 in Fig. 3(h). Actually, BOMO in this case is at the direct strand 5'-ggg-3'. We have the impression that BOMO is in the 3'-ccc-5' because we sum the electronic clouds of the direct and complementary strands in the previous section, seeking the electronic motif of BOMO.

The collective probabilities $n_i$ are not the only strand dependent variable in SP1 and EGR1. The field displacement $y_i$ of the Morse potential is also strand dependent. This means that the electronic cloud $y_i$, given by the Morse potential in Eq. (4), usually contracts when $n_i = 1$, i.e. in the presence of purines. The contraction of the electronic cloud is more intense in adenine ($y_i = -0.125 \pm 0.001$ Å) than in guanine ($y_i = -0.114 \pm 0.001$ Å). The simultaneous measurement of the size of the electronic cloud $y_i$ of the direct and complementary strands mirrors the nucleotide order and may lead to a new sequencing method.

The consensus sequences, the light and dark green lines in Figs. 3(g,h), are reflected in $y_i$ and $n_i$, Figs. 3(i-l). We usually observe the absence of the electronic cloud in the middle cytosine of the direct strand of the SP1 and EGR1 binding sites, black lines with circles in Figs. 3(i,k), as well as the opposite behavior in the complementary strand, red lines with plus signs in Figs. 3(i,k). But we should be cautious, because this is not true for the purine sequences with one replaced pyrimidine or vice versa, in the same way as for $n_i$, as mentioned before.


CONCLUSION 

In the extended ladder model proposed in this article, the Morse potential is the key component for the electronic behavior in the double helix DNA chain. However, the stacking interaction between adjacent base pairs in the model of Zhu et al. [1] has limited influence on the results, since this interaction is short-ranged.

BOMO, HOMO and LUMO show an electronic motif behind the SP1 and EGR1 binding sites, compatible with the consensus multiple alignments. In the case of SP1, there is one BOMO in the first and another in the third zinc finger binding site, and the HOMO and LUMO positions are before the consensus sequence. For EGR1, the first BOMO is distributed over the second zinc finger binding position and the second BOMO is after the consensus sequence. The HOMO and LUMO are over the second BOMO. BOMOs are degenerate with 7.98 ± 0.05 eV and 7.99 ± 0.03 eV for SP1 and EGR1, respectively. The HOMO eigenvalues are 8.52 ± 0.02 eV (SP1) and 8.52 ± 0.01 eV (EGR1). The LUMO energy levels are 9.3 ± 0.1 eV (SP1) and 9.4 ± 0.1 eV (EGR1).

When the valence band is filled, we observe a 100% probability of electronic presence in purines (adenine and guanine) and its absence in pyrimidines (thymine and cytosine). Furthermore, the sequence of electrons and holes coincides with the basicity and acidity of the DNA-protein binding amino acids in the zinc fingers. In particular, the sequence of positive and negative charges of the tips of SP1 and EGR1 coincides with the BOMO cloud distribution. Finally, the collective electronic behavior for the DNA chain with filled valence band will result in a sequence of electronic clouds around purine π-orbitals, dashed-dotted lines in Fig. 1(d).


ACKNOWLEDGMENTS 

The authors wish to thank Lei Liu for the discussions about zinc fingers and for kindly providing us Fig. 1(a). This work is supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), process number 248589/2013, Brazil.


[1] J.-X. Zhu, K. Ø. Rasmussen, A. V. Balatsky and A. R. Bishop, J. Phys.: Condens. Matter 19, 136203 (2007).

[2] S. V. Razin, V. V. Borunova, O. G. Maksimenko and O. L. Kantidze, Biochemistry 77, 217 (2012).

[3] T. H. Kim, Z. K. Abdullaev, A. D. Smith, K.A. Ching, D. I. Loukinov, R. D. Green, M. Q. 
Zhang, V. V. Lobanenkov and B. Ren, Cell 128, 1231 (2007). 

[4] H. Chen, Y. Tian, W. Shu, X. Bo and S. Wang, PLoS ONE 7, e41374 (2012). 

[5] A. Klug, Ann. Rev. Biochem. 79, 213 (2010). 

[6] C.-T. Ong and V. G. Corces, Nature Reviews Genetics 15, 234 (2014).

[7] J. Q. Ling, T. Li, J. F. Hu, T. H. Vu, H. L. Chen, X. W. Qiu, A. M. Cherry and A. R. Hoffman, Science 312, 269 (2006).

[8] T. M. Yusufzai, H. Tagami, Y. Nakatani and G. Felsenfeld, Molecular Cell 13, 291 (2004).

[9] S. Kurukuti, V. K. Tiwari, G. Tavoosidana, E. Pugacheva, A. Murrell, Z. Zhao, V. Lobanenkov, W. Reik and R. Ohlsson, Proc. Natl. Acad. Sci. USA 103, 10684 (2006).

[10] C. J. Feinauer, A. Hofmann, S. Goldt, L. Liu, G. Mate and D. W. Heermann, Advances in 
Protein Chemistry and Structural Biology 90, 67 (2013). 

[11] C.-K. Peng, S. V. Buldyrev, A. L. Goldberger, S. Havlin, F. Sciortino, M. Simons and H. E. 
Stanley, Nature 356, 168 (1992). 

[12] C.-K. Peng, S. V. Buldyrev, S. Havlin, M. Simons, H. E. Stanley and A. L. Goldberger, Phys. 
Rev. E 49, 1685 (1994). 

[13] N. N. Oiwa and C. Goldman, Phys. Rev. Lett. 85, 2396 (2000). 


[14] N. N. Oiwa and C. Goldman, Cell Biochemistry and Biophysics 42, 145 (2005).

[15] D. W. Heermann, Current Opinion in Cell Biology 23, 332 (2011).

[16] N. N. Oiwa, J. Phys.: Condens. Matter 19, 181002 (2007).

[17] M. Peyrard and A. R. Bishop, Phys. Rev. Lett. 62, 2755 (1989).

[18] E. Macia, Phys. Rev. B 76, 245123 (2007).

[19] G. Weber, Nucl. Acids Res. 41, e30 (2013).

[20] J. Langowski and D. W. Heermann, Seminars in Cell & Developmental Biology 18, 659 (2007). 

[21] M. Bohn, D. W. Heermann and R. van Driel, Phys. Rev. E 76, 051805 (2007). 

[22] L. Liu and D. W. Heermann, J. Phys.: Condens. Matter 27, 064107 (2015). 

[23] E. Braun, Y. Eichen, U. Sivan and G. Ben-Yoseph, Nature 391, 775 (1998). 

[24] H.-W. Fink and C. Schönenberger, Nature 398, 407 (1999).

[25] P. J. de Pablo, F. Moreno-Herrero, J. Colchero, J. Gómez Herrero, P. Herrero, A. M. Baró, P. Ordejón, J. M. Soler and E. Artacho, Phys. Rev. Lett. 85, 4992 (2000).

[26] M. Taniguchi and T. Kawai, Physica E 33, 1 (2006). 

[27] P. Tran, B. Alavi and G. Gruner, Phys. Rev. Lett. 85, 1564 (2000). 

[28] L. Cai, H. Tabata and T. Kawai, App. Phys. Lett. 77, 3105 (2000). 

[29] D. Porath, A. Bezryadin, S. de Vries and C. Dekker, Nature 403, 635 (2000). 

[30] K.-H. Yoo, D.H. Ha, J.-O. Lee, J.W. Park, J. Kim, J. J. Kim, H.-Y. Lee, T. Kawai and Han 
Yong Choi, Phys. Rev. Lett. 87, 198102 (2001). 

[31] H.-Y. Lee, H. Tanaka, Y. Otsuka, K.-H. Yoo, J.-O. Lee and T. Kawai, App. Phys. Lett. 80, 1670 (2002).

[32] H. Yamada, Int. J. Mod. Phys. B 18, 1697 (2004).

[33] B. P. W. Oliveira, E. L. Albuquerque and M. S. Vasconcelos, Surface Science 600, 3770 (2006).

[34] R. G. Sarmento, E. L. Albuquerque, P. D. Sesion Jr., U. L. Fulco and B. P. W. Oliveira, Phys. Lett. A 373, 1486 (2009).

[35] R. Venkatramani, S. Keinan, A. Balaeff and D. N. Beratan, Coordination Chemistry Reviews 255, 635 (2011).

[36] R. G. Sarmento, G. A. Mendes, E. L. Albuquerque, U. L. Fulco, M. S. Vasconcelos, O. Ujsaghy, V. N. Freire and E. W. S. Caetano, Phys. Lett. A 376, 2413 (2012).

[37] G. Lu, P. Maragakis and E. Kaxiras, Nano Letters 5, 897 (2005).

[38] E. C. M. Chen and E. S. Chen, Chem. Phys. Lett. 435, 331 (2007). 




[39] E. S. Chen and E. C. M. Chen, Molecular Simulation 35, 719 (2009). 

[40] N. A. Richardson, J. Gu, S. Wang, Y. Xie and H. F. Schaefer III, J. Am. Chem. Soc. 126,
4404 (2004).

[41] J. Gu, J. Leszczynski and H. F. Schaefer III, Chemical Reviews 112, 5603 (2012).

[42] E. Shapir, H. Cohen, A. Calzolari, C. Cavazzoni, D. A. Ryndyk, G. Cuniberti, A. Kotlyar,
R. Di Felice and D. Porath, Nature Materials 7, 68 (2008).

[43] K. Senthilkumar, F. C. Grozema, C. F. Guerra, F. M. Bickelhaupt, F. D. Lewis, Y. A. Berlin,
M. A. Ratner and L. D. A. Siebbeles, J. Am. Chem. Soc. 127, 14894 (2005).

[44] H. Mehrez and M. P. Anantram, Phys. Rev. B 71, 115405 (2005). 

[45] M. Zilly, O. Ujsaghy and D. E. Wolf, Phys. Rev. B 82, 125125 (2010). 

[46] J. Kaczynski, T. Cook and R. Urrutia, Genome Biology 4, 206 (2003). 

[47] G. Thiel and G. Cibelli, Journal of Cellular Physiology 193, 287 (2002).

[48] J. Jortner, M. Bixon, T. Langenbacher and M. E. Michel-Beyerle, Proc. Natl. Acad. Sci. USA
95, 12759 (1998).

[49] B. Giese, Accounts of Chemical Research 33, 631 (2000).

[50] M. Bixon and J. Jortner, Chemical Physics 281, 393 (2002).

[51] K. Soetaert, T. Petzoldt and R. W. Setzer, Journal of Statistical Software 33, 1 (2010). 

[52] H. Wang, J. P. Lewis and O. P. Sankey, Phys. Rev. Lett. 93, 016401 (2004). 

[53] C. Skerka, E. L. Decker and P. F. Zipfel, J. Biol. Chem. 270, 22500 (1995).

[54] J. Yao, N. Mackman, T. S. Edgington and S.-T. Fan, J. Biol. Chem. 282, 17795 (1997).

[55] W. K. Wong, K. Chen and J. C. Shih, Molecular Pharmacology 59, 852 (2002).

[56] A. Iavarone and J. Massague, Molecular and Cellular Biology 19, 916 (1999).

[57] J. R. Schultz, L. N. Petz and A. M. Nardulli, Molecular and Cellular Endocrinology 201, 165
(2003).

[58] C. Zhang, D.-J. Shin and T. F. Osborne, Biochem. J. 386, 161 (2005).

[59] D. Petersohn, S. Schoch, D. R. Brinkmann and G. Thiel, J. Biol. Chem. 270, 24361 (1995).

[60] D. Mechtcheriakova, A. Wlachos, H. Holzmüller, B. R. Binder and E. Hofer, Blood 93, 3811
(1999).

[61] S. A. Wolfe, L. Nekludova and C. O. Pabo, Annu. Rev. Biophys. Biomol. Struct. 3, 183 (1999).

[62] R. T. Nolte, R. M. Conlin, S. C. Harrison and R. S. Brown, Proc. Natl. Acad. Sci. USA 95,
2938 (1998).




[63] N. N. Oiwa and J. A. Glazier, Physica A 331, 221 (2002). 

[64] N. N. Oiwa and J. A. Glazier, Cell Biochemistry and Biophysics 41, 41 (2004).

[65] T. Dauxois, M. Peyrard and A. R. Bishop, Phys. Rev. E 47, 684 (1993).

[66] The UniProt Consortium, Nucleic Acids Research 42, D191 (2014). 

[67] N. P. Pavletich and C. O. Pabo, Science 252, 809 (1991). 

[68] M. Elrod-Erickson, M. A. Rould, L. Nekludova and C. O. Pabo, Structure 4, 1171 (1996). 

[69] M. Ghomi, R. Letellier, J. Liquier and E. Taillandier, Int. J. Biochem. 22, 691 (1990).

[70] D. A. Benson, K. Clark, I. Karsch-Mizrachi, D. J. Lipman, J. Ostell and E. W. Sayers, Nucl.
Acids Res. 42, D32-7 (2014).

[71] E. S. Lander et al., Nature 409, 860 (2001).

[72] R. C. Perier, T. Junier and P. Bucher, Nucl. Acids Res. 26, 353 (1998).

[73] V. Praz, R. Perier, C. Bonnard and P. Bucher, Nucl. Acids Res. 30, 322 (2002).

[74] R. Dreos, G. Ambrosini, R. C. Perier and P. Bucher, Nucl. Acids Res. 41, D157 (2013). 

[75] H. McWilliam, W. Li, M. Uludag, S. Squizzato, Y. M. Park, N. Buso, A. O. Cowley and R.
Lopez, Nucl. Acids Res. 41, W597 (2013).

[76] D. Charif and J. R. Lobry, pp. 207-232 in U. Bastolla, M. Porto, H. E. Roman and M. Vendruscolo
(eds.), Structural approaches to sequence evolution: Molecules, networks, populations (Springer
Verlag, New York, 2007).

[77] H. Pages, P. Aboyoun, R. Gentleman and S. DebRoy, Biostrings, version 2.34.1 (2015). 

[78] M. Morgan, S. Anders, M. Lawrence, P. Aboyoun, H. Pages and R. Gentleman, Bioinformatics 
25, 2607 (2009). 




Appendix A: The zinc fingers in SP1 and EGR1


The Cys2His2 zinc finger unit is composed of one zinc ion between two β-sheets and one
α-helix [61], Fig. 1(a). There are two cysteines at one end of the β-sheet and two
histidines in the C-terminal portion of the α-helix [61]. Each ZF binds three nucleotides,
Fig. 1(b,c). Transcription factors typically use between 2 and 4 ZFs to identify specific
sites along the DNA. In particular, EGR1 interacts through 1-3ZF, fitting itself into
the major groove of the dDNA, Fig. 1(a). The in situ interaction of each Cys2His2 zinc
finger of the mouse EGR1 with the double DNA chain (dDNA) has been detailed in the
literature. The EGR1 gene encodes a nuclear protein related to cell differentiation and
mitogenesis, and is localized at position 5q31.1 of the human cytogenetic map. SP1 acts
on a large number of GC-rich promoters dispersed along the genome [46]. This ZF
transcription factor acts as enhancer and histone deacetylase binder, increasing the
strength of the chromatin wrapping by histones, and modulates DNA-binding specificity.
The gene responsible for SP1 is in 12q13.1. Here we focus on transcription factors with
few zinc fingers, since the binding mechanism of those with many ZFs is not well
understood [22, 61].


Appendix B: The model parameter choices 


When we transform the one-dimensional Zhu et al. model [1] into the extended ladder, each
site represents one nucleotide instead of a base pair. We also alter the concept of the
displacement field y_i [36]. The displacement field in the DNA melting model is the radius
of the electronic cloud from the hydrogen bridge between two nucleotides. Now the
displacement field is the variation of the electronic cloud radius of the nucleotide
π orbital. So, according to the density functional theory (DFT) studies of the nucleotides
[40, 41], D_A, D_T, D_C and D_G in the Morse potential H_b, Eq. (1), are respectively
0.25 eV, 0.44 eV, 0.33 eV and 0.45 eV. The DFT analysis, supported by experimental anion
photoelectron spectra, also suggests that a_A, a_T, a_C and a_G in the Morse potential are
correspondingly around 3.0 Å⁻¹, 3.0 Å⁻¹, 3.0 Å⁻¹ and 2.5 Å⁻¹ [38, 39]. The k_v comes from
DNA vibrational spectra obtained with Raman spectroscopy [69] and we fix it to 0.0125 eV.
The electronic hopping rates in the free electron Hamiltonian H_e, Eq. (2), are the same
as in the literature [36, 43-45]. Thus, we use Table 1 for t_{5'-XY-3'}, t_{5'-XY-5'} and
t_{3'-XY-3'}. The interstrand hopping rate within a G-C pair is 0.055 eV and within an A-T
pair it is fixed to 0.047 eV. The values for the on-site potentials ε_A, ε_T, ε_C and ε_G
are respectively 8.631 eV, 9.464 eV, 9.722 eV and 8.178 eV. In the electron-base
Hamiltonian H_eb, Eq. (3), we may control the gap size between HOMO and LUMO by varying
the electron-base coupling. Thus, we compare the gap in our spectra with those reported in
the literature [36, 42, 45, 52] and we fix this coupling to 1.0.
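As a rough illustration of the Morse parameters quoted above, the following Python sketch tabulates them per nucleotide and evaluates the potential for a given displacement. The functional form V(y) = D (exp(-a y) - 1)^2 is an assumption borrowed from the usual Peyrard-Bishop convention, since Eq. (1) is not reproduced in this appendix; the names `MORSE` and `morse_potential` are illustrative only.

```python
import math

# Morse parameters per nucleotide as quoted in this appendix:
# (depth D in eV, inverse width a in 1/Angstrom).
MORSE = {
    "A": (0.25, 3.0),
    "T": (0.44, 3.0),
    "C": (0.33, 3.0),
    "G": (0.45, 2.5),
}


def morse_potential(base, y):
    """Return the Morse energy in eV for displacement y (Angstrom).

    ASSUMPTION: V(y) = D * (exp(-a * y) - 1)**2, the standard
    Peyrard-Bishop form; the paper's Eq. (1) may differ in detail.
    """
    depth, inv_width = MORSE[base]
    return depth * (math.exp(-inv_width * y) - 1.0) ** 2


if __name__ == "__main__":
    # Energy cost of a 0.1 Angstrom displacement for each nucleotide.
    for base in "ATCG":
        print(base, round(morse_potential(base, 0.1), 4))
```

Under this assumed form the potential vanishes at y = 0 and saturates at the depth D for large positive displacements, which is why D sets the energy scale of each nucleotide.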


Appendix C: Selection of GenBank files and nucleotide alignment 


We use the DNA sequence from the human reference map, annotation release 106 (build
GRCh38) [70]. The criteria for selecting the binding sites in this work are as follows. The
binding sites must have experimental confirmation in vitro. We remark that we usually
observe many single nucleotide polymorphisms (SNPs) between the reference map and the
reported experimental samples, because the reference map is basically a consensus sequence
from 9 individuals [71], while the samples in experimental binding site data belong to one
individual. Nested binding sites are a common occurrence, but we try to avoid overlapping
binding sites in order to simplify the search for an electronic motif. The binding site of
these transcription factors is in the promoter, a region between 500 and 2000 bp distant
from the beginning of the gene. We spot similar SP1 and EGR1 binding sites, TATA box and
other structures reported in individual samples in the GenBank reference map as well as in
databanks such as the Eukaryotic Promoter Database for SP1 and EGR1 [72-74]. Finally, we
use the nucleotide sequences in FASTA and GenBank flat file format, since the nucleotides
are just nucleosides with the phosphate group, as we mentioned in the introduction.

We select 16 binding sites in 10 different genes, see Table 2. We remark that the IL2
and TNF genes have binding sites for both SP1 and EGR1. The selected binding site regions
of the human genome are 50 bp long for MOAB SP1p,d [55], IL2 SP1 and TNF SP1. We consider
segments 70 bp long around CDC25A SP1 and PGR SP1p,d [57]. The segment around the three
SP1 binding sites a, b and c of the gene SREBF1 (also known as SREBP-1a) has 90 bp. We
extract sequences 50 bp long for the EGR1 binding sites of the genes EGR1, SYN2 and TP53
(aka P53), but 80 bp long for the VEGFA EGR1 binding site [60]. By the way, EGR1 is a
curious zinc finger protein, because its own gene also has an EGR1 binding site, so the
EGR1 protein can regulate its own expression. The sequences of the IL2 and TNF EGR1
binding sites are the same 50 bp segments as for IL2 and TNF SP1. To identify the binding
regions we use BLAST [75], SeqinR 3.1-3 [76], Biostrings 2.34.1 [77] and ShortRead 1.24.0
[78], since the genome has four different readings: the normal direction (5'-XX'-3' in
Fig. 1), reverse (3'-X'X-5'), in the complementary strand (3'-YY'-5') and in the
reverse-complementary direction (5'-Y'Y-3'). We indicate these reading directions with 'r'
and 'c' in parentheses, for reverse and complementary respectively, in Table 2.
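The four reading directions can be made concrete with a short sketch. The analysis in this work relies on the R packages cited above; the Python below is only an illustrative stand-in under that caveat, and the function name `readings` and the dictionary keys are hypothetical choices that mirror the '', 'r', 'c' and 'rc' tags of Table 2.

```python
# Watson-Crick complement for lower- and upper-case nucleotides.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def readings(seq):
    """Return the four readings of a nucleotide string.

    Keys follow the tags used in Table 2: '' (normal), 'r' (reverse),
    'c' (complementary strand) and 'rc' (reverse-complementary).
    """
    comp = seq.translate(COMPLEMENT)
    return {
        "": seq,
        "r": seq[::-1],
        "c": comp,
        "rc": comp[::-1],
    }


if __name__ == "__main__":
    # A ten-nucleotide GC-rich example string.
    for tag, s in readings("ggggcggggc").items():
        print(repr(tag), s)
```

Searching a reference sequence for a reported binding site then amounts to testing all four strings, since a published sample may have been read in any of these directions.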

Once we have selected the region of interest, we perform a multiple sequence alignment using
Clustal Omega, a software that organizes the sequences using hidden Markov models for
alignments and a multidimensional embedding space for data clustering [75].


We adopt the same notation for the nucleotide positioning in the promoter as in
[56-59], so there is no position zero. The counting always starts with 1 at the first
nucleotide of the consensus sequence, and the first nucleotide before the consensus
sequence is always -1. We adopt position 1 as the first nucleotide of the literature
consensus sequence for SP1 and EGR1. In Table 2, the consensus sequences are in light and
dark green and they are defined by a simple majority, i.e. when the number of aligned
nucleotides is bigger than or equal to 6 and 4 for SP1 and EGR1, respectively.
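The two conventions of this paragraph, position labels without a zero and a consensus by simple majority, can be sketched as follows. This is a minimal Python illustration, not the procedure actually used for Table 2; the function names, the '-' placeholder for columns without a majority, and the toy alignment are all hypothetical.

```python
from collections import Counter


def position_labels(n_before, n_total):
    """Labels ..., -2, -1, 1, 2, ... for n_total alignment columns.

    n_before columns precede position 1; there is no position zero.
    """
    labels = []
    for k in range(n_total):
        offset = k - n_before
        labels.append(offset if offset < 0 else offset + 1)
    return labels


def majority_consensus(rows, threshold):
    """Column-wise consensus of equally long aligned rows.

    A nucleotide is kept when it occurs at least `threshold` times in
    its column; otherwise '-' is written (an illustrative placeholder).
    """
    consensus = []
    for column in zip(*rows):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count >= threshold else "-")
    return "".join(consensus)


if __name__ == "__main__":
    # Toy alignment, NOT data from Table 2.
    rows = ["gcgg", "gcga", "gcgg", "acgt"]
    print(position_labels(2, 5))
    print(majority_consensus(rows, 3))
```

With the thresholds quoted above, one would call `majority_consensus` with `threshold=6` for the ten SP1 rows and `threshold=4` for the six EGR1 rows.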
FIGURE LEGENDS



        t(5'-XY-3') = t(3'-YX-5')        t(5'-XY-5')                      t(3'-XY-3')
X \ Y    G       A       C       T       G       A       C       T       G       A       C       T
G        0.053  -0.077  -0.114   0.141   0.012  -0.013   0.002  -0.009  -0.032  -0.011   0.022  -0.014
A       -0.010  -0.004   0.042  -0.063  -0.013   0.031  -0.001   0.007  -0.011   0.049   0.017   0.007
C        0.009  -0.002   0.022  -0.055   0.002  -0.001   0.001   0.0003  0.022   0.017   0.010  -0.004
T        0.018  -0.031  -0.028   0.180  -0.009   0.007   0.0003  0.001  -0.014  -0.007  -0.004   0.006


TABLE I. Hopping rates in eV for the extended ladder model reported in [43-45].
SP1 binding sites (positions: -1 1)
MOAB SP1d (rc) [54]
MOAB SP1p (rc) [54]
CDC25A SP1p [55]
SREBF1 SP1a (r) [57]
SREBF1 SP1b (rc) [57]
SREBF1 SP1c (rc) [57]
PGR SP1d (rc) [56]
PGR SP1p (rc) [56]
IL2 SP1 (r) [52]
TNF SP1 (c) [53]
consensus               ggggcg gg cgct
                        ggggcggggc
                        1ZF 2ZF 3ZF

EGR1 binding sites
EGR1 EGR1 (c) [41]      cggga gcgggggcga grcccagacc
SYN2 EGR1 (c) [58]      accca gcgggggagg acgttcgggg
TP53 EGR1 (r) [41]      tcttc gcggatgcga gggggatggc
VEGFA EGR1 [59]         tcccg gcggggcgga gccatgcgcc
IL2 EGR1 (r) [52]       aaaca taggggtggg gg aat tt
TNF EGR1 (c) [53]       aggtt taggggcggg gg cgc ta cct
consensus               gcggggg
                        1ZF 2ZF 3ZF

Table 2. Nucleotide alignment for SP1 and EGR1. The reading directions in the reverse and
complementary strands are respectively indicated with r and c in the parentheses.
Nucleotides with at least 10% probability of electronic presence in the bottom of the
occupied molecular orbitals (BOMOs), i.e. probability > 0.1, are in gray and yellow. The
nucleotides with a probability of at least 0.1 for HOMO and LUMO are respectively indicated
by orange and red bordered boxes. The consensus sequence is the simple majority (the number
of aligned nucleotides is ≥ 6 and ≥ 4 respectively for SP1 and EGR1). The three zinc finger
binding sites for SP1 and EGR1 (1-3ZF) are indicated in light and dark green [47, 53-61].
FIG. 1. (a) Spatial structure of the three EGR1 zinc fingers (blue) embracing the DNA major
groove (orange). The zinc ions are in black and the DNA-protein bindings of the second zinc
finger are in red. (b) and (c) are the DNA binding sites and amino acid sequences in the
three EGR1 and SP1 zinc fingers (1ZF to 3ZF). Solid red lines indicate the binding between
one particular nucleotide and its corresponding amino acid. The dotted lines in (a), (b) and
(c) are hydrogen bonds that stabilize the first G-R or G-K bonds in each zinc finger. The
nucleotides in yellow are those with 100% probability of electronic presence when the
valence band is filled, n_e = n. The negatively charged amino acids with weak (threonine, T)
or strong acid property (glutamic acid, E) are indicated in yellow too. The positively
charged basic arginine (R), histidine (H) and lysine (K) as well as protein-binding
cytosines are indicated in gray. (d) The diagram for the DNA extended ladder model. The
light dotted line is the electronic equilibrium radius for the Morse potential. The dark
dotted line is the displacement field y_i. The dashed-dotted lines are the purine (adenine
and guanine) electronic clouds with the electron-base coupling equal to 1.0 and n_e = n. The
dashed lines are the interstrand electronic hopping. The solid lines are the sugar-phosphate
backbones.





















FIG. 2. (a) The electronic density of states (DOS) for (C)_63 in black lines and (T)_63 in
red lines. (d) The same as in (a), except that the sequence has one C or T in i = 32. In (d)
the BOMO for (T)_63 with one replaced C in position 32 is indicated by G, and the HOMO and
LUMO energy levels for (C)_63 with T in i = 32 are respectively indicated by H and L. The
electronic clouds |φ|² for HOMO and LUMO (e) for (C)_63 (black lines) and the same sequence
with T in i = 32 (red lines). The electronic clouds for HOMO (b) and BOMO (f) for (T)_63
(black lines) and the same sequence with C in position 32 (red lines).



FIG. 3. The results for the SP1d and EGR1 binding sites of the MOAB [55] and EGR1 genes are
respectively in the left and right columns. We remark that the EGR1 transcription factor can
bind in the promoter of its own gene. (a) and (b) are the density of states (DOS), where the
BOMO, HOMO and LUMO energy levels of the zinc finger binding site are indicated by G, H and
L. (c) and (d) are the probability |φ_i|² of BOMO (dashed and solid black lines) and HOMO
electrons (orange line). The BOMO levels E_G in the valence band are degenerate, with values
of 8.00 ± 0.01 eV and 7.98 ± 0.02 eV for SP1d and EGR1, respectively. The HOMO energy E_HOMO
carries an uncertainty of ± 0.01 eV in both (c) and (d). The electronic clouds |φ_i|² of
LUMO in the conduction band are in (e) and (f), with E_LUMO likewise given to within
± 0.01 eV. The nucleotide sequences are given in (g) and (h), where we underline the 1-3ZF
binding consensus sequence in light and dark green lines [47, 53-61]. We remark that the
MOAB SP1d is in the reverse-complementary direction, while the EGR1 reading sequence is in
the complementary strand. The nucleotides with at least 10% probability of finding BOMO
electrons are in gray and yellow. The HOMO and LUMO nucleotides with |φ_i|² > 0.1 are in
orange and marked with red bordered boxes, respectively. (i) and (j) are the probability of
electronic presence in the direct strand (black) and the complementary strand (red), when
the valence band is completely filled, n_e = n. (k) and (l) are the displacement fields y_i
in the Morse potential with n_e = n for the direct strand (black) and for the complementary
strand (red).























































